Neural Oscillations in Speech: Don't be Enslaved by the Envelope
نویسندگان
چکیده
In a recent “Perspective” article (Giraud and Poeppel, 2012), Giraud and Poeppel lay out in admirable clarity how neural oscillations and, in particular, nested oscillations at different time scales, might enable the human brain to understand speech. They provide compelling evidence for “enslaving” of ongoing neural oscillations by slow fluctuations in the amplitude envelope of the speech signal, and propose potential mechanisms for how slow theta and faster gamma oscillatory networks might work together to enable a concerted neural coding of speech. This model is unparalleled in its fruitful incorporation of state-of-the-art computational models and neurophysiology (e.g., the intriguing pyramidal–interneuron gamma loops, PING – which will unfortunately not be observable in healthy, speechprocessing humans within the near future). The authors propose a scenario focused on theta and gamma, where problems in speech comprehension are sorted out if (and only if) the brain syncs well enough to the amplitude fluctuations of the incoming signal. However, while we enjoy the “perspective” Giraud and Poeppel (2012) are offering, it seems to oversimplify the available evidence in at least three key respects: First, how “slow” is a slow neural oscillation? Although it might be troublesome to reliably record fast, local gamma oscillations outside the skull, we can do so with satisfying precision in the lower-frequency ranges. So, why not allow the model to gain specificity and, accordingly, be specific about the ranges in which effects were observed? Giraud and Poeppel report the range of rates in which amplitude fluctuations in speech occur as 4–7 Hz (p. 511), 1–5 Hz (Figure 2), 5–10 Hz (p. 514, Figure 5), and <10 Hz (p. 514). Moreover, neural “theta” is defined as 1–8 Hz (Figure 1), 4–8 Hz (p. 511), 2–6 Hz (Figure 6), and 8.33 Hz (120 ms, p. 514). Also, they show the most focal coupling of gamma power with the phase of an 8-Hz oscillation – text-book alpha. The trouble is that, if we cut loosely across the boundaries between delta and theta or theta and alpha, we might overlook important functional differentiations between these frequency bands (Klimesch et al., 2007). On the delta–theta end, it has been demonstrated that delta (here: 1.4 Hz) phase covaries with theta (here: 7.8 Hz) oscillatory power in macaque auditory cortex (Lakatos et al., 2005), at least implying that theta oscillations themselves are slaves to lower-frequency masters. On the theta–alpha end, auditory evoked perturbations hint at an intimate, but antagonistic relationship of neural theta and alpha. Independent of the ongoing debate regarding whether the evoked potential reflects an additive brain response or a phase reset of ongoing neural oscillations (for review, see Sauseng et al., 2007), time–frequency representations of auditory evoked brain activity are typically characterized by initially strong phase alignment (i.e., increased phase coherence across trials) that spans across theta as well as alpha frequencies. This is often followed by a dissociation: alpha (>8 Hz) steeply decreases in power, while theta (<7 Hz) power remains high (e.g., Shahin et al., 2009). To sum up this point, Giraud and Poeppel (2012, p. 511) argue for a “principled relation between the time scales present in speech and the time constants underlying neuronal cortical oscillations,” but what if the time scales present in speech cross functional boundaries between oscillatory bands in the human brain? Put simply, if delta vs. theta bands, or theta vs. alpha bands, do subserve discontinuous, separable processing modes in the auditory and speech-processing domain, then further speaking of “slow neural oscillations” will hinder rather than benefit our understanding. Recently, we observed a negative correlation of alpha and theta power in response to speech, and it was the periand poststimulation alpha suppression that indexed best speech comprehension (Obleser and Weisz, 2012). Note that in this study, effects were attained with an intelligibility manipulation that was relying on spectral changes only – envelope changes were less effective in modulating alpha suppression, and did not affect theta power at all. Which leads us to our next point: An over-emphasis of speech envelope. Amplitude envelope and syllable rate are currently very much emphasized in the speech and vocalizations literature (e.g., Luo and Poeppel, 2007; Chandrasekaran et al., 2009; Ghitza and Greenberg, 2009), likely because (a) they are easily quantified, and (b) as outlined above, we are best at measuring relatively low-frequency brain oscillations. Hence, it is tempting to focus on these slow envelope fluctuations. However, the speech envelope is readily obscured in noisy backgrounds and reverberant environments (Houtgast and Steeneken, 1985) and intact spectral content can be used by the listener to at least partially compensate for degraded temporal envelope information (Sheft et al., 2008). Indeed, although the temporal envelope of speech has been shown to be very important for comprehension (e.g., Drullman et al., 1994a,b) there is good evidence that the spectral content of the speech signal is at least as decisive for speech intelligibility (if not more so; Xu et al., 2005; Lorenzi et al., 2006; Luo and Poeppel, 2007; Obleser et al., 2008; Obleser and Weisz, 2012; Scott and Neural oscillations in speech: don’t be enslaved by the envelope
منابع مشابه
Neuronal oscillations and speech perception: critical-band temporal envelopes are the essence
A RECENT OPINION ARTICLE (NEURAL OSCILLATIONS IN SPEECH: do not be enslaved by the envelope. Obleser et al., 2012) questions the validity of a class of speech perception models inspired by the possible role of neuronal oscillations in decoding speech (e.g., Ghitza, 2011; Giraud and Poeppel, 2012). The authors criticize, in particular, what they see as an over-emphasis of the role of temporal sp...
متن کاملNeural Oscillations Carry Speech Rhythm through to Comprehension
A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners' processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope ...
متن کاملEnvelope-based inter-aural time difference localization training to improve speech-in-noise perception in the elderly
Background: Many elderly individuals complain of difficulty in understanding speech in noise despite having normal hearing thresholds. According to previous studies, auditory training leads to improvement in speech-in-noise perception, but these studies did not consider the etiology, so their results cannot be generalized. The present study aimed at investigating the effectiveness of envelope-b...
متن کاملAcoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing
A growing body of research suggests that intrinsic neuronal slow (<10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignm...
متن کاملBrain-speech alignment enhances auditory cortical responses and speech perception.
Asymmetry in auditory cortical oscillations could play a role in speech perception by fostering hemispheric triage of information across the two hemispheres. Due to this asymmetry, fast speech temporal modulations relevant for phonemic analysis could be best perceived by the left auditory cortex, while slower modulations conveying vocal and paralinguistic information would be better captured by...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 6 شماره
صفحات -
تاریخ انتشار 2012